Using Prefix-Trees for Efficiently Computing Set Joins

Authors

Ravi Jampani

Vikram Pudi

Abstract

Joins on set-valued attributes (set joins) have numerous database applications. In this paper we propose PRETTI (PREfix Tree based seT joIn) – a suite of set join algorithms for containment, overlap and equality join predicates. Our algorithms use prefix trees and inverted indices. These structures are constructed on-the-fly if they are not already precomputed. This feature makes our algorithms usable for relations without indices and when joining intermediate results during join queries with more than two relations. Another feature of our algorithms is that results are output continuously during their execution and not just at the end. Experiments on real-life datasets show that the total execution time of our algorithms is significantly less than that of previous approaches, even when the indices required by our algorithms are not precomputed.
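To make the abstract's idea concrete, the following is a minimal Python sketch of how a prefix tree and an inverted index could be combined for a set containment join. It is an illustration under stated assumptions, not necessarily the authors' exact procedure: the names `TrieNode`, `build_prefix_tree`, `build_inverted_index`, and `containment_join` are hypothetical, the prefix tree is assumed to be built over the relation holding the candidate subsets, and the inverted index over the relation holding the candidate supersets. Both structures are built on the fly, and results are yielded as soon as they are found rather than collected at the end.

```python
from collections import defaultdict


class TrieNode:
    """Node of a prefix tree over sorted set elements (hypothetical helper)."""

    def __init__(self):
        self.children = {}    # element -> TrieNode
        self.record_ids = []  # ids of R-sets that end exactly at this node


def build_prefix_tree(r_sets):
    """Insert every R-set, with its elements sorted, into a prefix tree."""
    root = TrieNode()
    for rid, items in r_sets.items():
        node = root
        for item in sorted(items):
            node = node.children.setdefault(item, TrieNode())
        node.record_ids.append(rid)
    return root


def build_inverted_index(s_sets):
    """Map each element to the ids of the S-sets that contain it."""
    index = defaultdict(set)
    for sid, items in s_sets.items():
        for item in items:
            index[item].add(sid)
    return index


def containment_join(r_sets, s_sets):
    """Yield (rid, sid) pairs where r_sets[rid] is a subset of s_sets[sid]."""
    root = build_prefix_tree(r_sets)
    index = build_inverted_index(s_sets)
    # Each stack entry pairs a tree node with the S-ids that contain every
    # element on the path from the root to that node.
    stack = [(root, set(s_sets))]
    while stack:
        node, candidates = stack.pop()
        # R-sets ending here are contained in every remaining candidate.
        for rid in node.record_ids:
            for sid in candidates:
                yield rid, sid
        for item, child in node.children.items():
            narrowed = candidates & index.get(item, set())
            if narrowed:  # prune subtrees that no S-set can contain
                stack.append((child, narrowed))
```

Because R-sets that share a prefix share a path in the tree, the candidate list for that prefix is intersected once and reused for all of them. Since `containment_join` is a generator, a caller can consume pairs while the traversal is still running, for example with `for rid, sid in containment_join(r, s): ...`.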

Similar resources

Similarity Joins in Relational Database Systems

State-of-the-art database systems manage and process a variety of complex objects, including strings and trees. For such objects equality comparisons are often not meaningful and must be replaced by similarity comparisons. This book describes the concepts and techniques to incorporate similarity into database systems. We start out by discussing the properties of strings and trees, and identify t...

Set containment joins using two prefix trees (Exposé)

A common example (see Mamoulis, 2003) of a set containment problem is the matching of people, people’s skills, jobs and job skills. The way to answer “Which people match which job?” depends on the data structures used. In relational databases set attributes are usually modeled by normalized mapping using a separate relation for every single set attribute. People and their skills are then...

Applying Segmented Right-Deep Trees to Pipelining Multiple Hash Joins

The pipelined execution of multijoin queries in a multiprocessor-based database system is explored in this paper. Using hash-based joins, multiple joins can be pipelined so that the early results from a join, before the whole join is completed, are sent to the next join for processing. The execution of a query is usually denoted by a query execution tree. To improve the execution of pipelined ...

Plug&Join: An easy-to-use Generic Algorithm for Efficiently Processing Equi and Non-Equi Joins

This paper presents Plug&Join, a new generic algorithm for efficiently processing a broad class of different types of joins in an extensible database system. Plug&Join is not only designed to support equi joins, temporal joins, spatial joins, subset joins and other types of joins, but in contrast to previous algorithms it can be easily customized and it allows efficient processing of new types ...

PEL: Position-Enhanced Length Filter for Set Similarity Joins

Set similarity joins compute all pairs of similar sets from two collections of sets. Set similarity joins are typically implemented in a filter-verify framework: a filter generates candidate pairs, possibly including false positives, which must be verified to produce the final join result. Good filters produce a small number of false positives, while they reduce the time they spend on hopeless ...

Publication date: 2005
